Electronic device and method for wireless communication

ABSTRACT

The present disclosure provides an electronic device and method for wireless communication. The electronic device comprises: a processing circuit configured to determine a collaborative access point set for users within a predetermined range by using a wireless network topology of a wireless network as a state, and to redetermine the collaborative access point set for the users in response to a change in the wireless network topology, wherein the wireless network topology comprises a user distribution and an access point distribution.

The present application claims priority to Chinese Patent Application No. 201711009075.6, titled “ELECTRONIC DEVICE AND METHOD FOR WIRELESS COMMUNICATION”, filed on Oct. 25, 2017 with the China National Intellectual Property Administration, which is incorporated herein by reference in its entirety.

FIELD

The present disclosure generally relates to the field of wireless communications, in particular to resource management in a User Centric Network (UCN), and more particularly to an electronic apparatus and a method for wireless communications.

BACKGROUND

With the rapid development of communication networks, the mobile data rate requirements of users are increasing exponentially. In addition, the mobility, flexible configuration and the like of apparatuses also bring challenges to future wireless networks. An ultra-dense network (UDN) involving deployment of microminiaturized base stations has become an effective technology for addressing the growing mobile data rate requirements. Since the small base stations are densely and flexibly configured, it becomes possible to implement the user centric network (UCN), so as to support effective communication of massive numbers of mobile users and devices. The UCN allows each user to jointly select multiple access points, such as base stations, to perform coordinated transmission, thereby meeting the quality of service requirements of all users with the highest probability. Therefore, a user-centric ultra-dense network (UUDN) will become the main trend of future networks.

In addition, with the development of artificial intelligence and the Internet of Things, artificial intelligence methods such as machine learning have become one of the focuses of recent research. The wireless network emulates the mode of human thinking, so that the resource management becomes more intelligent.

SUMMARY

In the following, an overview of the present disclosure is given simply to provide a basic understanding of some aspects of the present disclosure. It should be understood that this overview is not an exhaustive overview of the present disclosure. It is not intended to determine a critical part or an important part of the present disclosure, nor to limit the scope of the present disclosure. An object of the overview is only to give some concepts in a simplified manner, which serves as a preface to the more detailed description given later.

According to an aspect of the present disclosure, an electronic apparatus for wireless communications is provided. The electronic apparatus includes processing circuitry. The processing circuitry is configured to: determine a coordination access point group for a user within a predetermined range, by taking a wireless network topology structure of a wireless network as a state; and re-determine a coordination access point group for the user in response to a change of the wireless network topology structure, wherein the wireless network topology structure comprises a distribution of users and a distribution of access points.

According to another aspect of the present disclosure, a method for wireless communications is provided. The method includes: determining a coordination access point group for a user within a predetermined range, by taking a wireless network topology structure of a wireless network as a state; and re-determining a coordination access point group for the user in response to a change of the wireless network topology structure, wherein the wireless network topology structure comprises a distribution of users and a distribution of access points.

According to other aspects of the present disclosure, there are further provided computer program codes and computer program products for implementing the above methods, as well as a computer-readable storage medium having recorded thereon the computer program codes for implementing the methods described above.

With the electronic apparatus and the method according to the present disclosure, the coordination access point group (APG) can be dynamically selected, thereby meeting the communication requirements of all users in a better way.

These and other advantages of the present disclosure will be more apparent from the following detailed description of a preferred embodiment of the present disclosure in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

To further set forth the above and other advantages and features of the present disclosure, a detailed description will be made in the following, taken in conjunction with the accompanying drawings, in which identical or like reference signs designate identical or like components. The accompanying drawings, together with the detailed description below, are incorporated into and form a part of the specification. It should be noted that the accompanying drawings only illustrate, by way of example, typical embodiments of the present disclosure and should not be construed as a limitation to the scope of the disclosure. In the accompanying drawings:

FIG. 1 is a schematic diagram showing a scenario of the UUDN;

FIG. 2 is a block diagram showing function modules of an electronic apparatus for wireless communications according to an embodiment of the present disclosure;

FIG. 3 is a graph of an example of a utility function;

FIG. 4 is a block diagram showing function modules of an electronic apparatus for wireless communications according to an embodiment of the present disclosure;

FIG. 5 is a block diagram showing function modules of an electronic apparatus for wireless communications according to another embodiment of the present disclosure;

FIG. 6 is a block diagram showing function modules of an electronic apparatus for wireless communications according to another embodiment of the present disclosure;

FIG. 7 is a block diagram showing function modules of an electronic apparatus for wireless communications according to another embodiment of the present disclosure;

FIG. 8 is a schematic diagram showing an information procedure between a user, an access point and a spectrum management device;

FIG. 9 is a schematic diagram showing a simulation scenario of a simulation instance;

FIG. 10 is a diagram showing an example of an action matrix and a Q-value matrix;

FIG. 11 is a schematic diagram showing a result after a determined action is performed;

FIG. 12 is a diagram showing another example of the action matrix and the Q-value matrix;

FIG. 13 is a schematic diagram showing a result after a determined action is performed;

FIG. 14 is a schematic diagram showing a simulation scenario 1 of another simulation instance;

FIG. 15 is a schematic diagram showing a simulation scenario 2 of another simulation instance;

FIG. 16 is a comparison diagram of a cumulative distribution function (CDF) of a user satisfaction rate obtained based on the simulation scenario 1;

FIG. 17 is a graph showing ratios of meeting communication quality requirement of a user in a case that the user moves along a rectangular trace in the simulation scenario 2 with different numbers of rounds;

FIG. 18 is a flowchart of a method for wireless communications according to an embodiment of the present disclosure;

FIG. 19 is a block diagram showing an example of a schematic configuration of a server 700 to which the technology of the present disclosure may be applied; and

FIG. 20 is an exemplary block diagram illustrating the structure of a general purpose personal computer capable of realizing the method and/or device and/or system according to the embodiments of the present disclosure.

DETAILED DESCRIPTION OF EMBODIMENTS

An exemplary embodiment of the present disclosure will be described hereinafter in conjunction with the accompanying drawings. For the purpose of conciseness and clarity, not all features of an embodiment are described in this specification. However, it should be understood that multiple decisions specific to the embodiment have to be made in the process of developing any such embodiment to realize a particular object of a developer, for example, conforming to constraints related to a system and a business, and these constraints may change from one embodiment to another. Furthermore, it should also be understood that although the development work may be very complicated and time-consuming, for those skilled in the art benefiting from the present disclosure, such development work is only a routine task.

Here, it should also be noted that in order to avoid obscuring the present disclosure due to unnecessary details, only a device structure and/or processing steps closely related to the solution according to the present disclosure are illustrated in the accompanying drawings, and other details having little relationship to the present disclosure are omitted.

First Embodiment

FIG. 1 is a schematic diagram showing a scenario of the UUDN. There are multiple access points (APs) around user equipment (UE, which is also referred to as a user hereinafter). The UE performs coordinated transmission by using different APs. Further, each of the APs is communicatively connected to a spectrum management device such as a spectrum coordinator (SC). The SC determines a coordination APG for the UE within a management range of the SC. The coordination APG is a group of APs having coordination relationships with the corresponding UE, that is, a group of APs providing communication access services to the UE. Further, a local SC may properly communicate with an adjacent SC, so as to interchange information. It can be seen that, compared with a conventional cellular network architecture, the network architecture shown in FIG. 1 is characterized by the large number of APs, which is even greater than the number of UEs.

The AP described herein may be any node which provides a network communication service, such as a base station, a small base station or the like. The base station may be implemented as any type of evolved node B (eNB), such as a macro eNB and a small eNB.

The small eNB may be an eNB such as a pico eNB, a micro eNB, and a home (femto) eNB that covers a cell smaller than a macro cell. Alternatively, the base station may be realized as any other type of base station such as a NodeB and a base transceiver station (BTS). The base station may include a main body (that is also referred to as a base station apparatus) configured to control wireless communication, and one or more remote radio heads (RRH) disposed in a different place from the main body. In addition, various types of terminals may each operate as the base station by temporarily or semi-persistently executing a base station function.

The UE or the user may be any wireless communication device providing service or any terminal device. For example, the terminal device may be implemented as a mobile terminal (such as a smart phone, a tablet personal computer (PC), a notebook PC, a portable game terminal, a portable/dongle mobile router and a digital camera) or an in-vehicle terminal (such as a car navigation device). The terminal device may also be implemented as a terminal (that is also referred to as a machine type communication (MTC) terminal) that performs machine-to-machine (M2M) communication. In addition, the terminal device may be a wireless communication module (such as an integrated circuit module including a single die) installed on each of the above terminals.

In addition, the SC shown in FIG. 1 is just an example of the spectrum management device. Other types of spectrum management device, such as a spectrum access system (SAS), may be adopted, and the example is not restrictive.

In the scenario shown in FIG. 1, the user and even the access point may both be in a mobile state. Thus, dynamically selecting a coordination APG for the user is advantageous for maintaining stable communication of high quality. In view of this, an electronic apparatus 100 for wireless communications is provided according to an embodiment of the present disclosure. As shown in FIG. 2, the electronic apparatus 100 includes a determining unit 101 and an updating unit 102. The determining unit 101 is configured to determine a coordination access point group (APG) for a user within a predetermined range, by taking a wireless network topology structure of a wireless network as a state. The updating unit 102 is configured to re-determine a coordination access point group for the user in response to a change of the wireless network topology structure.

The determining unit 101 and the updating unit 102 each may be implemented by one or more processing circuitries. The processing circuitry, for example, may be implemented as a chip. The electronic apparatus 100, for example, may be located on the spectrum management device (such as the SC or the SAS) shown in FIG. 1. Alternatively, the electronic apparatus 100 may be communicatively connected to the spectrum management device.

In this embodiment, the electronic apparatus 100 may determine the coordination APG for the user within the predetermined range by using a reinforcement learning algorithm. The predetermined range, for example, may be at least a part of a management range of the spectrum management device on which the electronic apparatus is located.

In the reinforcement learning algorithm, learning is regarded as a process of exploration and evaluation, to learn a mapping from an environment state to an action, so that a selected action obtains a maximum reward from the environment, that is, so that the external environment evaluates the learning system as optimum in some sense (or the operation performance of the whole system is optimum). The reinforcement learning algorithm used herein, for example, may include a Q-learning algorithm, a difference learning algorithm or the like. The wireless network topology structure may be taken as a state.

In an example, the wireless network topology structure includes a distribution of users and a distribution of access points. In other words, in a case that the users and/or the access points move, or on-off states of specific users and/or access points change, the wireless network topology structure changes. As shown in FIG. 1, in a case that the UE moves in a direction from bottom to top, as indicated by the black dashed line with arrow, the wireless network topology structure changes, for example, corresponding to states S_(t), S_(t+1), and S_(t+2) shown in FIG. 1. In this case, the coordination APG determined for the user in a previous state may not be applicable in a new state, for example, may not meet communication requirements of the user. Therefore, the updating unit 102 re-determines a coordination APG for the user in response to a change of the wireless network topology structure, so as to provide stable and continuous communication service to the user.

In an example, the change of the wireless network topology structure includes a change of a position of the user. The change is detected by the user. When detecting the change, the user reports the change to the electronic apparatus 100 and requests the electronic apparatus 100 to re-determine a coordination APG for the user. In other examples, the change of the wireless network topology structure further includes a change of a position of the access point. The access point also reports the change of the position of the access point to the electronic apparatus 100. Correspondingly, the electronic apparatus 100 may re-determine a coordination APG for the user based on the change.

For example, the determining unit 101 may take a coordination relationship between the user and the access point as an action in the reinforcement learning algorithm, and with respect to each action, calculate an evaluation of the action based on a degree of meeting communication quality requirement of the user and a resulting network overhead when performing the action. Generally, the user has specific requirements for communication quality. When performing an action, the degree of meeting communication quality requirement of the user indicates one aspect of the evaluation of the action. The communication quality requirement of the user may be represented by, for example, quality of service (QoS) required by the user. As described in the following, the communication quality requirement of the user may be represented by a signal to interference and noise ratio (SINR) threshold. It should be noted that this is only an example and is not restrictive.

In addition, when the previous state changes to a current state, the action changes correspondingly. For example, an action determined in the previous state changes to another action. The change of the action indicates the change of the coordination APG of the UE, resulting in switching between APs, which incurs the network overhead. In terms of the evaluation of the action, the network overhead is expected to be as small as possible. Therefore, the network overhead indicates another aspect of the evaluation of the action.

In an example, the determining unit 101 determines the coordination APG for the user in the current state based on an action with the highest evaluation. In other words, the determining unit 101 determines the action with the highest evaluation as an action to be performed, so as to determine coordination APGs for respective users. For example, the action with the highest evaluation is an action which, when performed, results in the highest degree of meeting the communication quality requirement of the user and the lowest network overhead, compared with other actions.

For convenience of understanding, aspects of the embodiment are described with the Q-learning algorithm as an example in the following. However, it should be noted that this is not restrictive, and other reinforcement learning algorithms are also applicable in the present disclosure.

It is assumed that there are N users and M APs within the predetermined range. Coordination relationships between the users and the access points, that is, actions (which are also referred to as individuals) in the reinforcement learning algorithm, may be expressed by the matrix in the following equation (1):

$\begin{matrix}{A_{i}^{\prime} = \begin{bmatrix}a_{11} & a_{12} & \ldots & a_{1M} \\a_{21} & a_{22} & \ldots & a_{2M} \\\vdots & \vdots & \ddots & \vdots \\a_{N1} & a_{N2} & \ldots & a_{NM}\end{bmatrix}_{N \times M}} & (1)\end{matrix}$

where a_(n,m) (n=1, . . . , N; m=1, . . . , M) denotes a coordination relationship between an n-th user and an m-th AP. For example, in a case that a_(n,m) is equal to 1, it is indicated that there is a coordination relationship between the n-th user and the m-th AP. In a case that a_(n,m) is equal to 0, it is indicated that there is no coordination relationship between the n-th user and the m-th AP.

For convenience of operation, equation (1) may be transformed into a vector represented by equation (2).

$\begin{matrix}{A_{i} = \begin{bmatrix}a_{11} & a_{12} & \ldots & a_{1M} & a_{21} & a_{22} & \ldots & a_{2M} & \ldots & a_{N1} & a_{N2} & \ldots & a_{NM}\end{bmatrix}_{1 \times NM}} & (2)\end{matrix}$

That is, the rows in equation (1) are rearranged into one row. In a case of there being multiple actions, each of the multiple actions is taken as one row, to form an action matrix.
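As a non-limiting illustration, the rearrangement of equation (1) into equation (2) and the stacking of several actions into an action matrix may be sketched in Python as follows. The function names flatten_action and build_action_matrix, as well as the sample values for N=2 users and M=3 APs, are hypothetical and are chosen only for this sketch.

```python
def flatten_action(matrix):
    """Rearrange the rows of an N x M 0/1 matrix (equation (1))
    into a single row of length N*M (equation (2))."""
    return [a for row in matrix for a in row]


def build_action_matrix(actions):
    """Take each flattened action as one row of the action matrix."""
    return [flatten_action(action) for action in actions]


# Hypothetical example: N = 2 users, M = 3 access points.
# a[n][m] == 1 means the n-th user coordinates with the m-th AP.
action_a = [[1, 0, 1],
            [0, 1, 0]]
action_b = [[1, 1, 0],
            [0, 0, 1]]

action_matrix = build_action_matrix([action_a, action_b])
for row in action_matrix:
    print(row)
```

Each row of action_matrix then corresponds to one candidate action A_(i) of length N×M.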

First, multiple actions, that is, multiple A_(i)s having different values, are initially generated for a state, such as the state S_(t). For example, the generated actions may be constrained by setting a predetermined condition. The predetermined condition may include, for example, one or more of: the generated action causes the communication quality for each user to meet the communication quality requirement of the user; and the network overhead produced when using this coordination relationship relative to an action determined in the previous state does not exceed a predetermined overhead threshold. For example, the communication quality requirement may be expressed by the SINR threshold.

As described in the above, a degree of meeting communication quality requirement of each user and a resulting network overhead when performing each action in the state are taken as the evaluation of the action. In the Q-learning algorithm, the evaluation of the action is expressed by a Q-value. Evaluations of actions form a Q-value matrix.

In an example, the determining unit 101 may calculate the degree of meeting communication quality requirement of each user using an SINR threshold for the user and an estimated SINR of the user. The estimated SINR of a user being closer to the SINR threshold for the user indicates a higher degree of meeting communication quality requirement of the user. For example, the determining unit 101 may take degrees of meeting communication quality requirement of the users into consideration comprehensively.

In an example, the degree of meeting communication quality requirement of the user includes a utility value of all users and a cost value of not meeting the SINR of the user. The utility value of the user is calculated from a utility function. The utility function is a non-linear function of a ratio of the estimated SINR of the user to the SINR threshold for the user. The cost value depends on a difference between the SINR threshold of a user and the estimated SINR of the user. The utility value is used to represent a degree of satisfaction of the estimated SINR of the user relative to the SINR threshold. The cost value is used to denote a degree of dissatisfaction of the estimated SINR of the user relative to the SINR threshold.

For example, when performing an action A_(i) in the state S_(t), the degree of meeting communication quality requirement of the user R(S_(t), A_(i)) may be calculated by using the following equation (3):

$\begin{matrix}{{R( {S_{t},A_{i}} )} = {{\prod\limits_{n}^{N}U_{n}} - {\sigma {\sum\limits_{n}^{N}\lbrack {\max \{ {0,{{SINR}_{n}^{th} - {SINR}_{n}}} \}} \rbrack^{2}}}}} & (3)\end{matrix}$

where U_(n) denotes a utility value of an n-th user, which is calculated from a utility function of the user, for example, by using the following equation (4); σ denotes a cost factor; SINR_(n)^(th) denotes an SINR threshold for the n-th user; and SINR_(n) denotes an estimated SINR of the n-th user.

$\begin{matrix}{U_{n} = f_{n}\left( \mathrm{SINR}_{n},\mathrm{SINR}_{n}^{th} \right) = \frac{1}{2} \times \left\{ \tanh\left\{ \xi \times \left( \frac{\mathrm{SINR}_{n}}{\mathrm{SINR}_{n}^{th}} - \eta \right) \right\} + 1 \right\}} & (4)\end{matrix}$

In the above equation, tanh( ) denotes a hyperbolic tangent function, ξ denotes an extension factor (for example, which may be equal to 3.5834), and η denotes a symmetric center (for example, which may be equal to 0.8064). FIG. 3 shows a curve of an example of the utility function. As shown in FIG. 3, in a case that the SINR of the user is greater than the SINR threshold for the user, the curve of the utility function changes relatively slowly and approaches 1, so as to avoid an overlarge R value due to an over-high SINR of a user. It should be understood that the utility function is not limited to the form expressed by equation (4), but may be modified properly.
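By way of example only, the utility function of equation (4) may be sketched in Python as follows, using the example values of ξ and η given above. The function name utility is hypothetical, and the sketch assumes that the SINR and the SINR threshold are given on a linear scale (not in dB) and that the threshold is positive.

```python
import math

XI = 3.5834   # extension factor (example value from the text)
ETA = 0.8064  # symmetric center (example value from the text)


def utility(sinr, sinr_th, xi=XI, eta=ETA):
    """Utility value U_n of equation (4).

    Assumes linear-scale SINR values and sinr_th > 0.
    """
    return 0.5 * (math.tanh(xi * (sinr / sinr_th - eta)) + 1.0)


# The curve rises steeply below the threshold and flattens above it.
for ratio in (0.0, 0.5, ETA, 1.0, 2.0, 5.0):
    print(ratio, round(utility(ratio, 1.0), 4))
```

With these example parameters the utility equals 0.5 when the ratio of the SINR to the threshold equals η, and it is close to 0.8 when the SINR equals the threshold.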

In the above calculation, SINR_(n)^(th), for example, may be provided by the user. SINR_(n) may be estimated by various communication system models. In an example, SINR_(n) may be calculated by using the following equation (5):

$\begin{matrix}{\mathrm{SINR}_{n} = \frac{\sum\limits_{j \in \Phi_{C(n)}}p_{j}\left( d_{nj} \right)^{- \alpha}}{\sum\limits_{k \in \Phi_{I(n)}}p_{k}\left( d_{nk} \right)^{- \alpha} + n_{0}}} & (5)\end{matrix}$

where p_(j) and p_(k) denote power of a j-th AP and power of a k-th AP respectively, d_(nj) denotes a distance between the n-th user and the j-th AP, d_(nk) denotes a distance between the n-th user and the k-th AP, α denotes a path loss factor, Φ_(C(n)) denotes a coordination APG for the n-th user, Φ_(I(n)) denotes an interference APG for the n-th user, and n₀ denotes a noise power at a receiver of the user. The interference APG indicates a group of APs that interfere with the n-th user under consideration when providing communication access services to other users.
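For illustration only, the estimation of equation (5) may be sketched in Python as follows. The function name estimate_sinr, the representation of each AP as a tuple (x, y, power), and the numerical values are assumptions made for this sketch; the sketch further assumes two-dimensional positions and that no AP is located exactly at the position of the user.

```python
import math


def estimate_sinr(user_pos, aps, coop_ids, interf_ids, alpha, noise_power):
    """Estimated SINR of one user according to equation (5).

    aps maps an AP identifier to (x, y, power) (hypothetical layout).
    coop_ids is the coordination APG, interf_ids the interference APG.
    """
    def received(ap_id):
        x, y, power = aps[ap_id]
        distance = math.hypot(user_pos[0] - x, user_pos[1] - y)
        return power * distance ** (-alpha)

    signal = sum(received(j) for j in coop_ids)
    interference = sum(received(k) for k in interf_ids)
    return signal / (interference + noise_power)


# Hypothetical layout: the user sits midway between two APs of equal power.
aps = {0: (0.0, 0.0, 1.0), 1: (2.0, 0.0, 1.0)}
sinr = estimate_sinr((1.0, 0.0), aps, coop_ids=[0], interf_ids=[1],
                     alpha=2.0, noise_power=1.0)
print(sinr)
```

In this layout both APs are at distance 1 from the user, so the received signal and the interference are both 1, and the result is 1/(1+1).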

As shown in the above equations (3) to (5), the determining unit 101 calculates the degree of meeting communication quality requirement of the user. In the Q-learning algorithm, the degree of meeting communication quality requirement of the user is equivalent to a reward. Position information of the user, position information and emitting power of the access point, and the communication quality requirement of the user such as the SINR threshold, are used in the above calculation.
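A minimal sketch of equation (3) in Python is given below, assuming that the estimated SINR values of the N users are already available on a linear scale. The function names utility and degree_of_meeting and the value of the cost factor σ are hypothetical; the utility function follows equation (4) with the example parameters given above.

```python
import math


def utility(sinr, sinr_th, xi=3.5834, eta=0.8064):
    """Utility value of equation (4) with the example parameters."""
    return 0.5 * (math.tanh(xi * (sinr / sinr_th - eta)) + 1.0)


def degree_of_meeting(sinrs, thresholds, sigma):
    """R(S_t, A_i) of equation (3): product of the utility values minus
    sigma times the sum of squared SINR shortfalls."""
    product = 1.0
    shortfall = 0.0
    for sinr, sinr_th in zip(sinrs, thresholds):
        product *= utility(sinr, sinr_th)
        shortfall += max(0.0, sinr_th - sinr) ** 2
    return product - sigma * shortfall


# Hypothetical values: the second user falls short of its threshold.
r_ok = degree_of_meeting([2.0, 2.0], [1.0, 1.0], sigma=0.1)
r_short = degree_of_meeting([2.0, 0.5], [1.0, 1.0], sigma=0.1)
print(r_ok, r_short)
```

An action under which a user falls below its SINR threshold is penalized twice in this sketch: through a lower utility value in the product and through the squared shortfall term.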

In addition, the determining unit 101 may be further configured to use, with respect to each action, a difference between this action and an action determined in a previous state as the network overhead produced by this action. For example, in a case that the determining unit 101 determines the action with the highest evaluation as the action to be performed, the action determined in the previous state is an action with the highest evaluation in the previous state. In a case that the current state is an initial state, that is, there is no previous state, the network overhead may be set to be zero.

In an example, the determining unit 101 may use the amount of network switching operations required when performing an action, as compared with the action determined in the previous state, as the network overhead produced by the action.

As described in the above, the action may be represented by a binary matrix of the coordination relationship. In this case, the network overhead may be represented by a Hamming distance between actions, as expressed by the following equation (6). In practice, in a case that an action is represented by 0 or 1, the Hamming distance between actions physically means the number of the switched coordination APs between two APG options. In the Q-learning algorithm, the network overhead is equivalent to the cost value.

$\begin{matrix}{{P{H( {S_{t},A_{i}} )}} = {{- \sigma}{\sum\limits_{n}^{N}\lbrack {D_{ham}( {A_{i},A_{S_{t - 1}}} )} \rbrack^{2}}}} & (6)\end{matrix}$

where A_(S_(t−1)) denotes an action determined to be performed in the previous state S_(t−1), σ denotes a cost factor, and D_(ham)( ) denotes a calculation of the Hamming distance. As described in the above, in a case that the state S_(t) is the initial state, PH(S_(t), A_(i)) may be set to be zero.

In another example, the network overhead produced when performing the action is taken into consideration only when the network overhead exceeds a predetermined overhead threshold. In this case, the network overhead may be calculated from the following equation (7):

$\begin{matrix}{{P{H( {S_{t},A_{i}} )}} = {{- \sigma}{\sum\limits_{n}^{N}\lbrack {\max \{ {0,{{D_{ham}( {A_{i},A_{S_{t - 1}}} )} - T_{d}}} \}} \rbrack^{2}}}} & (7)\end{matrix}$

where T_(d) denotes a predetermined network overhead threshold, that is, a predetermined Hamming distance threshold. As shown in equation (7), the network overhead is calculated only when the Hamming distance between an action A_(i) and the action A_(S_(t−1)) is greater than the predetermined network overhead threshold T_(d). Otherwise, the network overhead is set to be zero. The predetermined network overhead threshold is used in this calculation. The predetermined network overhead threshold may be provided by the AP.
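As a sketch only, the network overhead of equations (6) and (7) may be computed in Python as follows. One reading of the sum over n, assumed here and not confirmed by the text, is that the Hamming distance is taken per user (that is, per row of the action matrix of equation (1)) and the squared values are summed over the N users. The function names hamming and network_overhead and the sample values are hypothetical. Setting the threshold to zero corresponds to equation (6).

```python
def hamming(a, b):
    """Number of positions at which two 0/1 sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def network_overhead(action, previous, sigma, threshold=0):
    """PH(S_t, A_i) of equation (7); threshold=0 gives equation (6).

    action and previous are N x M 0/1 matrices. The per-user (per-row)
    Hamming distance is an assumption of this sketch. previous is None
    in the initial state, for which the overhead is zero.
    """
    if previous is None:
        return 0.0
    total = 0.0
    for row, prev_row in zip(action, previous):
        excess = max(0, hamming(row, prev_row) - threshold)
        total += excess ** 2
    return -sigma * total


# Hypothetical example: user 1 switches two APs, user 2 switches one.
previous = [[1, 0, 0], [0, 1, 0]]
candidate = [[0, 1, 0], [0, 1, 1]]
print(network_overhead(candidate, previous, sigma=0.5))               # equation (6)
print(network_overhead(candidate, previous, sigma=0.5, threshold=1))  # equation (7)
```

With a threshold of 1, only the user that switches two APs contributes to the overhead in this example.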

By combining the above equations (3) and (7), the evaluation of the action may be calculated as follows, so as to obtain a Q-value matrix Q(S_(t)) in the state S_(t). Elements of the Q-value matrix Q(S_(t)) are calculated as follows:

$\begin{matrix}{Q\left( S_{t},A_{i} \right) = R\left( S_{t},A_{i} \right) + PH\left( S_{t},A_{i} \right)} & (8)\end{matrix}$

where the Q-value matrix Q(S_(t)) is a matrix with a dimension of T×1, and T denotes the number of actions. Based on the obtained Q-value matrix Q(S_(t)), for example, an action corresponding to the largest Q-value, that is, an action with the highest evaluation, may be determined as a selected result of the APG in the state S_(t). In this case, the communication quality requirement of each user is met as much as possible, and the network overhead produced by switching the AP is reduced.
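For illustration, the calculation of equation (8) and the selection of the action with the largest Q-value may be sketched in Python as follows. The function names q_values and select_action and the numerical values of R and PH are hypothetical and serve only to show the selection step.

```python
def q_values(rewards, overheads):
    """Q(S_t, A_i) = R(S_t, A_i) + PH(S_t, A_i) for each of the T actions."""
    return [r + ph for r, ph in zip(rewards, overheads)]


def select_action(actions, q):
    """Return the index and the action with the largest Q-value."""
    best = max(range(len(q)), key=q.__getitem__)
    return best, actions[best]


# Hypothetical T = 3 actions for N = 1 user and M = 3 APs.
actions = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]
rewards = [0.6, 0.9, 0.8]      # R values, e.g. from equation (3)
overheads = [-0.1, -0.5, 0.0]  # PH values, e.g. from equation (7)

q = q_values(rewards, overheads)
index, chosen = select_action(actions, q)
print(index, chosen)
```

In this example the second action has the largest R value, but its switching overhead lowers its Q-value, so the third action is selected.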

It should be understood that the above calculation for selecting the APG may be performed online, offline, or in a combination of online and offline manners.

As shown in FIG. 4, the electronic apparatus 100 may further include a storage unit 103. The storage unit 103 is configured to store, with respect to each state, each action in the state in association with an evaluation calculated with respect to the action as an evaluation matrix.

The storage unit 103 may be implemented by various storages. The evaluation, for example, may include two aspects of the above described degree of meeting communication quality requirement of the user (for example, R(S_(t), A_(i))) and the network overhead produced by performing the action (for example, PH(S_(t), A_(i))).

It should be understood that after the evaluation matrix is created, the updating unit 102 may be configured to determine, when the state changes and in a case that there is an evaluation matrix for the changed state, an action to be performed in the changed state based on content of the evaluation matrix. Specifically, an action suitable for the current state, for example, the action with the highest evaluation, may be selected based on the current state.

After the action is selected, the coordination relationship between the UE and the AP is determined correspondingly. In this way, calculation load can be reduced, processing speed can be increased, and the APG can be switched quickly and stably while the user is in the mobile state.

In another aspect, in a case of no evaluation matrix for the changed state, an evaluation matrix is created for the changed state as described in the above.
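The reuse of a stored evaluation matrix may be sketched, purely as an assumption-laden illustration, in Python as follows. The class name EvaluationStore, the use of a hashable tuple of positions as the key of a state, and the layout of each entry as (action, R, PH) are hypothetical choices made for this sketch.

```python
class EvaluationStore:
    """Keeps one evaluation matrix per state (hypothetical helper)."""

    def __init__(self):
        self._matrices = {}

    def get_or_create(self, state_key, create_matrix):
        # create_matrix is called only when no matrix exists for the state.
        if state_key not in self._matrices:
            self._matrices[state_key] = create_matrix(state_key)
        return self._matrices[state_key]

    def best_action(self, state_key, create_matrix):
        matrix = self.get_or_create(state_key, create_matrix)
        # Each entry is (action, R, PH); the evaluation is R + PH.
        return max(matrix, key=lambda entry: entry[1] + entry[2])[0]


calls = []


def create_matrix(state_key):
    """Stand-in for the full calculation of equations (3) to (8)."""
    calls.append(state_key)
    return [((1, 0), 0.6, -0.1), ((0, 1), 0.9, -0.2)]


store = EvaluationStore()
state = (("ue", 0.0, 1.0), ("ap", 2.0, 3.0))  # hypothetical topology key
first = store.best_action(state, create_matrix)
second = store.best_action(state, create_matrix)  # served from storage
print(first, second, len(calls))
```

When the same state is encountered again, the action is read from the stored matrix and the evaluation is not recalculated, which reflects the reduction in calculation load described above.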

In addition, the updating unit 102 may be further configured to update, when the state changes, an evaluation of the action performed in the previous state which is stored in the storage unit 103, using information of actual communication quality of the user when performing the determined action in the previous state. The actual communication quality of the user is acquired through measurement by the user.

For example, the updating unit 102 may replace the stored degree of meeting communication quality requirement that is obtained by estimation with the degree of meeting communication quality requirement that is calculated based on the actual communication quality of the user. In a case that the state changes from the state S_(t) to the state S_(t+1) and the action A_(i) is determined in the state S_(t), the updating unit 102, for example, may replace the stored R(S_(t), A_(i)) with the value given by the following equation (9):

$\begin{matrix}{R_{t + 1} = \prod\limits_{n = 1}^{N}U_{n} - \sigma\sum\limits_{n = 1}^{N}\left\lbrack \max\left\{ 0,\mathrm{SINR}_{n}^{th} - \mathrm{SINR}_{n}^{actual} \right\} \right\rbrack^{2}} & (9)\end{matrix}$

where SINR_(n) ^(actual) denotes an actual SINR of an n-th user, whichis also used when calculating U_(n) in equation (9). For example,SINR_(n) ^(actual) is a numerator in the tanh function when calculatingU_(n) by using equation (4).

The evaluation matrix is updated based on the information of the actualcommunication quality. In a case that actual communication qualitycorresponding to an action determined in a state is poor, the previouslyselected action would not be selected when returning to this stateafterwards, thereby improving the communication quality.

In another example, correlation between the changed state, that is, thecurrent state, and the previous state may also be taken intoconsideration when updating the evaluation matrix. For example, theupdating unit 102 is configured to replace a portion of the evaluationof the action performed in the previous state which is related to thedegree of meeting the communication quality requirement of the user witha following calculated value: a weighted sum of the actual degree ofmeeting the communication quality requirement of the user in theprevious state and the estimated highest degree of meeting thecommunication quality requirement of the user in the current state.

For example, in the case that the state changes from the state S_(t) to the state S_(t+1) and the action A_(i) is determined in the state S_(t), the updating unit 102 may replace the stored R(S_(t), A_(i)) with the value given by the following equation (10):

$\begin{matrix} {R( {S_{t},A_{i}} )}arrow{R_{t + 1^{+ \gamma}}{\max\limits_{A}{R( {S_{t + 1},A} )}}}  & (10)\end{matrix}$

where R_(t+1) is the same as that in equation (9), and

$\max\limits_{A} R(S_{t + 1}, A)$

denotes the maximum of R(S_(t+1), A) over all actions A in the state S_(t+1), that is, the R value of the action whose R value is the largest in that state. γ is a discount factor and denotes a degree of correlation between the previous state and the current state. In a case of γ=0, the updated R value depends only on the actual value R_(t+1) obtained for the action performed in the previous state.
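The replacement in equation (10) can be sketched as follows. The nested-dictionary layout of the stored R values, the function name `update_evaluation`, and the choice of treating a state with no stored values as contributing zero are all assumptions made for illustration.

```python
def update_evaluation(R, prev_state, action, r_actual, cur_state, gamma):
    """Sketch of equation (10): R(S_t, A_i) <- R_{t+1} + gamma * max_A R(S_{t+1}, A).

    R         -- dict mapping state -> dict mapping action -> stored R value
    r_actual  -- R_{t+1} from equation (9), based on actual communication quality
    gamma     -- discount factor
    """
    # Assumption: a current state with no stored values contributes 0.
    best_next = max(R.get(cur_state, {}).values(), default=0.0)
    R.setdefault(prev_state, {})[action] = r_actual + gamma * best_next
    return R[prev_state][action]
```

With `gamma` set to zero the stored value is simply replaced by `r_actual`, which matches the setting used in the first simulation instance.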

In addition, more generally, the wireless network topology structure being taken as the state may further include other variable parameters, such as one or more of: communication quality requirement of the UE, maximum emitting power of the AP, a predetermined network overhead threshold of the AP, and the like. That is, changes of the parameters may also cause the updating unit 102 to re-determine the APG, or update the stored evaluation of the action performed in the previous state.

In summary, the electronic apparatus 100 in this embodiment can determine coordination APGs for different states by using the reinforcement learning algorithm, to dynamically select the APG, thereby meeting communication quality requirements of all users in a better way. Further, although the reinforcement learning algorithm is taken as an example in the above description, the present disclosure is not limited thereto, and other algorithms may also be used to determine the coordination APG.

Second Embodiment

FIG. 5 is a block diagram showing function modules of an electronic apparatus 200 for wireless communications according to another embodiment of the present disclosure. Besides the units shown in FIG. 2, the electronic apparatus 200 may further include a grouping unit 201. The grouping unit 201 is configured to: in each state, acquire an action by grouping the access points taking a user as a center and selecting the coordination APG for the user within a group of the user.

Similarly, the grouping unit 201 may be implemented by one or more processing circuitries. The processing circuitry, for example, may be implemented as a chip. In addition, although not shown in FIG. 5, the electronic apparatus 200 may further include the storage unit 103 described with reference to FIG. 4.

For example, the grouping unit 201 may perform the grouping based on a Euclidean distance between the user and the access point. A subordination parameter value of the access point to the user is calculated by using the following equation (11):

$\begin{matrix}{m_{ij} = \frac{1}{\| x_{i} - u_{j} \|},\quad \| x_{i} - u_{j} \| = \text{Euclidean distance}} & (11)\end{matrix}$

where u_(j) denotes a j-th UE, and x_(i) denotes an i-th AP. A position of the AP and a position of the UE in a wireless network vary in different states, and the subordination parameter value also varies in different states. A short Euclidean distance from the AP to the UE corresponds to a large subordination parameter value. If a subordination parameter value of the AP to a certain UE is large, the AP is assigned to the UE. In this way, a group for each UE is established.
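The grouping based on equation (11) can be sketched as follows. The text only states that an AP is assigned to a UE for which its subordination parameter value is large; assigning each AP to the single UE with the largest value is one possible reading and is an assumption of this sketch, as are the function names and the use of two-dimensional coordinates.

```python
import math


def subordination(ap_pos, ue_pos):
    """Equation (11): reciprocal of the Euclidean distance between an AP and a UE."""
    d = math.dist(ap_pos, ue_pos)
    return math.inf if d == 0 else 1.0 / d


def group_aps(ap_positions, ue_positions):
    """Assign each AP to the UE with the largest subordination value (assumed rule).

    Returns a dict mapping UE index -> list of AP indices in that UE's group.
    """
    groups = {j: [] for j in range(len(ue_positions))}
    for i, ap in enumerate(ap_positions):
        best_ue = max(
            range(len(ue_positions)),
            key=lambda j: subordination(ap, ue_positions[j]),
        )
        groups[best_ue].append(i)
    return groups
```

Because the positions change from state to state, the grouping would be recomputed whenever the state changes.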

The determining unit 101 randomly selects the coordination access point group for the user within the group of the user, and takes the coordination relationship between the user and the access point which meets a predetermined condition as the action. The predetermined condition, similar to that in the first embodiment, may include one or more of: the communication quality for each user meets the communication quality requirement of the user; and the network overhead produced when using this coordination relationship relative to an action determined in the previous state does not exceed a predetermined overhead threshold.

A difference between this embodiment and the first embodiment lies in that the action in this embodiment is generated in a different manner. For example, in a case that the action is represented by the binarization matrix, in this embodiment, an element corresponding to an AP outside of the group for the UE is set to be a value denoting no coordination relationship (for example, zero).

Therefore, with the electronic apparatus 200 including the grouping unit 201 in this embodiment, a selectable range of the coordination APs for the user can be reduced, so as to easily determine a reasonable action, thereby improving selection accuracy and reducing calculation load.

Third Embodiment

FIG. 6 is a block diagram showing function modules of an electronic apparatus 300 for wireless communications according to another embodiment of the present disclosure.

Besides the units shown in FIG. 2, the electronic apparatus 300 further includes an estimating unit 301. The estimating unit 301 is configured to estimate, with respect to each state, a new action based on a preliminarily acquired action.

Similarly, the estimating unit 301 may be implemented by one or more processing circuitries. The processing circuitry, for example, may be implemented as a chip. In addition, although not shown in FIG. 6, the electronic apparatus 300 may further include the storage unit 103 described with reference to FIG. 4, the grouping unit 201 described with reference to FIG. 5 and the like.

In the first embodiment and the second embodiment, the action is preliminarily generated by randomly selecting the AP for the user. In this embodiment, in order to improve efficiency, the new action may be estimated further based on the preliminarily acquired actions.

For example, the estimating unit 301 may estimate the new action by using a genetic algorithm (GA).

Specifically, the estimating unit 301 may select N_(p) actions having better R values from among the preliminarily acquired actions to form an original population of the genetic algorithm. A network fitness matrix of the original population is calculated. The network fitness matrix of the population is acquired based on a Q-value of each action, as expressed by the following equation (12):

$\begin{matrix}{f_{P_{i}} = f_{i} = \left\{ \begin{matrix}{Q(S_{t}, P_{i})} & {Q(S_{t}, P_{i}) > 0} \\{\Delta \rightarrow 0} & \text{others}\end{matrix} \right.} & (12)\end{matrix}$

where P_(i) denotes an i-th individual in the population, that is, an i-th action, Δ denotes a value approximating zero, and Q(S_(t), P_(i)) denotes a Q-value corresponding to P_(i) in the state S_(t).

Next, a selection operation is performed. For example, by using a roulette selection method, a probability that each individual appears among the children is calculated based on a network fitness value of the individual in the original population, and N_(p) individuals are randomly selected based on the probability to form a child population. The probability p_(i) is calculated by using the following equation (13):

$\begin{matrix}{p_{i} = {\frac{f_{i}}{\sum\limits_{j = 1}^{N_{p}}f_{j}}.}} & (13)\end{matrix}$
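The fitness assignment of equation (12) and the roulette selection of equation (13) can be sketched as follows. The function names, the default value chosen for Δ, and the use of Python's `random.choices` for the weighted draw are assumptions of this sketch.

```python
import random


def fitness(q_values, delta=1e-6):
    """Equation (12): keep positive Q-values, replace the others by a small delta."""
    return [q if q > 0 else delta for q in q_values]


def selection_probabilities(fit):
    """Equation (13): each fitness value divided by the sum over the population."""
    total = sum(fit)
    return [f / total for f in fit]


def roulette_select(population, q_values, rng=random):
    """Draw N_p individuals (with replacement) according to equation (13)."""
    probs = selection_probabilities(fitness(q_values))
    return rng.choices(population, weights=probs, k=len(population))
```

Replacing non-positive Q-values by a small positive Δ keeps every probability well defined while making such individuals very unlikely to be selected.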

Next, a crossover operation is performed. Two individuals A_(m) and A_(n) are selected randomly from among the formed child population. The crossover operation is performed on multiple points that are selected randomly, to form a new individual or population. For example, the crossover operation performed on an i-th bit of an m-th individual A_(m) and an i-th bit of an n-th individual A_(n) is expressed as the following equation (14):

$\begin{matrix}{\begin{matrix}{0\ 0\ \overset{i}{1}\ 0\ 0\ 1\ 0\ 1\ 0\ 0} & \rightarrow & {0\ 0\ \overset{i}{0}\ 0\ 0\ 1\ 0\ 1\ 0\ 0} & A_{m} \\{1\ 0\ 0\ 1\ 0\ 1\ 0\ 1\ 0\ 0} & \rightarrow & {1\ 0\ 1\ 1\ 0\ 1\ 0\ 1\ 0\ 0} & {A_{n}.}\end{matrix}} & (14)\end{matrix}$

It should be noted that the individuals shown in equation (14) are exemplary, and the present disclosure is not limited thereto.
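The crossover operation can be sketched as an exchange of bits between two individuals at the selected points. The function names are hypothetical, and reading the example of equation (14) as an exchange of the third bit (index 2 when counting from zero) is an interpretation of the bit patterns shown there.

```python
import random


def random_points(length, count, rng=random):
    """Pick `count` distinct crossover positions at random."""
    return rng.sample(range(length), count)


def crossover(a_m, a_n, points):
    """Exchange the bits of two individuals at each position in `points`."""
    child_m, child_n = list(a_m), list(a_n)
    for i in points:
        child_m[i], child_n[i] = child_n[i], child_m[i]
    return child_m, child_n


# The two individuals of equation (14), crossed at the third bit.
a_m = [0, 0, 1, 0, 0, 1, 0, 1, 0, 0]
a_n = [1, 0, 0, 1, 0, 1, 0, 1, 0, 0]
new_m, new_n = crossover(a_m, a_n, points=[2])
```

The inputs are copied before the exchange, so the parent individuals remain available in the population.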

Next, a mutation operation is performed. An individual in the population obtained by the crossover operation is selected randomly. The mutation operation is performed on a point randomly selected in the individual, with the aim of generating a better individual. Since each chromosome of the individual is represented by 0 or 1, the mutation operation is performed to change a chromosome represented by 0 into a chromosome represented by 1, or to change a chromosome represented by 1 into a chromosome represented by 0. In this way, a new individual, that is, a new action is obtained.
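The mutation operation amounts to flipping one randomly chosen bit. The following is a minimal sketch with a hypothetical function name; it does not include the check against the predetermined condition.

```python
import random


def mutate(individual, rng=random):
    """Flip one randomly selected bit of a binary individual and return a new list."""
    child = list(individual)
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]  # 0 becomes 1, 1 becomes 0
    return child
```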

The estimating unit 301 may repeat the selection operation, the crossover operation and the mutation operation, so as to generate multiple new actions. For example, the number of times for repeating the operations may be set in advance.

In an example, the estimating unit 301 is further configured to take an action estimated by using the genetic algorithm as a new action only if the action satisfies a predetermined condition. Similarly, the predetermined condition may include one or more of: the communication quality for each user meets the communication quality requirement of the user; and the network overhead produced when using this action relative to an action determined in the previous state does not exceed a predetermined overhead threshold.

The new action obtained in the above is added to the preliminarily acquired actions, to form a new action group. The determining unit 101 determines an evaluation (for example, the Q-value in the first embodiment) of an action by using the reinforcement learning algorithm, to select an action with the highest evaluation as an action to be performed in the current state, so as to determine the coordination APG for each user.

The electronic apparatus 300 in this embodiment obtains a new action by using an estimation method such as the genetic algorithm, so as to expand the action group, such that an optimal coordination APG can be determined more accurately.

Fourth Embodiment

FIG. 7 is a block diagram showing function modules of an electronic apparatus 400 for wireless communications according to another embodiment of the present disclosure. Besides the units shown in FIG. 2, the electronic apparatus 400 may further include a transceiving unit 401. The transceiving unit 401 is configured to receive one or more of position information and communication quality requirement of the user and one or more of position information, information of maximum emitting power and a predetermined network overhead threshold of the access point, and transmit information of the determined coordination access point group to the access point.

The transceiving unit 401, for example, may be implemented by a communication interface. The communication interface, for example, may include a network interface, or an antenna and a transceiving circuitry, and the like. In addition, although not shown in FIG. 7, the electronic apparatus 400 may further include the storage unit 103 described with reference to FIG. 4, the grouping unit 201 described with reference to FIG. 5, and the estimating unit 301 described with reference to FIG. 6.

The above information received by the transceiving unit 401 is used to determine and update the coordination APG for the user. For example, in a case that the wireless network topology structure being taken as the state changes, the transceiving unit 401 re-acquires the above information.

In addition, the transceiving unit 401 is further configured to receive information of the actual communication quality of the user. For example, in a case that the state changes, the user reports actual communication quality, for example, an actual SINR and an actual utility value, obtained by performing the determined action in a state before changing, to the electronic apparatus 400.

The position information and communication quality requirement of the user may be provided to the transceiving unit 401 directly or via the access point.

For convenience of understanding, FIG. 8 shows a schematic diagram of an information procedure between a user (UE), an access point (AP) and a spectrum management device (for example, the SC or the SAS) in a case that the electronic apparatus 400 is arranged on the spectrum management device.

First, the UE requests an AP for coordination communication from the spectrum management device, and reports position information and information of the communication quality requirement such as the SINR threshold for the user to the spectrum management device. The AP reports the position information, the information of the maximum emitting power and the predetermined network overhead threshold of the AP to the spectrum management device. In a case that a position of the AP is fixed, the AP may report the position information of the AP only in a process of system initialization. As described in the above, the UE may directly report related information to the spectrum management device. Alternatively, the UE may report the related information to the spectrum management device via the AP. In the latter case, the information reported by the AP further includes the position information and the information of the communication quality requirement of the user.

After acquiring the above mentioned various pieces of information, the spectrum management device selects the coordination APG for the user. The spectrum management device may select an action having the largest Q-value by using the Q-learning reinforcement learning algorithm as described in detail in the first embodiment, so as to determine the coordination APG for each user. It should be noted that, in a case that evaluation matrices for multiple states are stored in the spectrum management device, and if the current state is included in the stored states, the action may be selected based on the stored evaluation matrices without repeating the reinforcement learning algorithm.

Next, the spectrum management device transmits information of the determined coordination APG to the AP, so that the AP can coordinate with the UE based on the information.

In the example shown in FIG. 8, if only the position of the UE can change, the UE, for example, periodically determines whether the position of the UE changes. In a case that the position of the UE changes or the change reaches a certain degree, which indicates that the wireless network topology structure changes, the UE is required to request a new coordination APG. In this case, the UE reports the changed position information of the UE to the spectrum management device. The UE further reports an actual utility value and an actual SINR of the UE that are obtained by performing the determined action in a state before changing, to the spectrum management device. The spectrum management device updates, based on the actual utility value and the actual SINR that are provided by the UE, a Q-value of the action determined in the previous state. In addition, the spectrum management device further reselects, based on current position information of the UE, an action to be performed in the current state. For example, as described in the above, the spectrum management device may perform the selection by performing the Q-learning reinforcement learning algorithm. Alternatively, in a case that an evaluation matrix for the current state is stored in the spectrum management device, the action to be performed may be selected by searching the evaluation matrix. Similarly, the spectrum management device transmits information of the determined coordination APG to the AP, so that the AP can coordinate with the UE based on the information.

It should be understood that the information procedure shown in FIG. 8 is only exemplary rather than restrictive.

In order to further illustrate details and effects of the technical solutions of the present disclosure, two simulation instances applying the technical solutions of the present disclosure are described in the following. First, the first simulation instance is described with reference to FIGS. 9 to 13. FIG. 9 is a schematic diagram showing a simulation scenario of the first simulation instance, where a triangle denotes a UE, a square denotes an AP, and a dashed line with an arrow indicates a movement trace of one UE. FIG. 9 shows four different positions of the UE, which denote states S₁, S₂, S₃ and S₄ respectively.

Parameters used in the simulation are listed as follows: an operation frequency of 3.5 GHz, a channel bandwidth of 10 MHz, 3 UEs, emitting power of 0 dBm, 16 APs, an SINR threshold for the UE of 7 dB, a noise figure of 5 dB at a receiver of the UE, 10 generations of population evolution in the genetic algorithm, a crossover ratio of 0.7, a mutation ratio of 0.1, 10 individuals, and a Hamming distance threshold of 5.

In the state S₁, the UE uploads the position information and the information of communication quality requirement to the spectrum management device. The AP uploads the position information, the information of the maximum emitting power, and the Hamming distance threshold to the spectrum management device. The spectrum management device generates some preliminary actions, and generates, based on the preliminary actions, new actions by using the genetic algorithm. The preliminary actions and the new actions form an action matrix A₁. An example of the action matrix A₁ in the state S₁ is shown on the left side in FIG. 10, where each row denotes one action, that is, one coordination relationship between the AP and the UE, and 18 actions are shown. Each action is represented by a binary sequence including 48 bits. There are M (M=16 in this embodiment) APs. Bits 1 to M denote a coordination relationship between a user 1 and the APs, bits M+1 to 2M denote a coordination relationship between a user 2 and the APs, and so on.
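The bit layout of an action described above can be illustrated with a short decoding sketch. The function name `decode_action` and the convention that a bit value of 1 denotes a coordination relationship are assumptions; AP numbers are returned starting from 1 to match the bit numbering in the text.

```python
def decode_action(bits, num_aps):
    """Split a flat binary action into the list of coordinating APs for each user.

    bits    -- sequence of 0/1 values, of length (number of users) * num_aps
    num_aps -- M, the number of APs
    Returns a list whose u-th entry holds the 1-based AP numbers for user u+1.
    """
    if len(bits) % num_aps != 0:
        raise ValueError("action length must be a multiple of the number of APs")
    num_users = len(bits) // num_aps
    return [
        [m + 1 for m in range(num_aps) if bits[u * num_aps + m] == 1]
        for u in range(num_users)
    ]
```

For the first simulation instance, with 3 UEs and M=16, each action has 3 × 16 = 48 bits.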

The spectrum management device generates a Q-value matrix Q(S₁, A₁) corresponding to the matrix A₁ by using the above mentioned Q-learning algorithm, as shown on the right side in FIG. 10. The Q-value matrix is calculated by using the above equations (3) to (5) and (7) to (8). In this simulation instance, the state S₁ is an initial state, and PH(S₁, A₁) is a null matrix.

The spectrum management device selects an action corresponding to a maximum value in the Q-value matrix, for example, an action 15, and notifies the AP to coordinate with the UE based on this action. FIG. 11 is a schematic diagram showing a result after the action 15 is performed. A UE and an AP that are circled with lines having the same line type have the coordination relationship.

Next, the state S₁ changes to the state S₂ due to movement of the UE. The UE uploads new position information, and an actual SINR and an actual utility value that are obtained by performing the action 15 in the state S₁, to the spectrum management device. The spectrum management device calculates an actual degree of meeting communication quality requirement obtained by performing the action 15 based on this information and by using equation (9), and updates a value of R_15 in R(S₁, A₁) by using equation (10). In equation (10), γ is set to be zero.

The spectrum management device updates the action matrix in the state S₁ by using the genetic algorithm, to obtain an action matrix A₂ in the state S₂, as shown on the left side in FIG. 12. Similarly, the spectrum management device generates a Q-value matrix Q(S₂, A₂) corresponding to A₂ by using the above mentioned Q-learning algorithm, as shown on the right side in FIG. 12. The Q-value matrix is calculated by using the above equations (3) to (5) and (7) to (8).

The spectrum management device selects an action corresponding to a maximum value in the Q-value matrix, for example, an action 11, and notifies the AP to coordinate with the UE based on this action. FIG. 13 is a schematic diagram showing a result after the action 11 is performed. The UE and the AP that are circled with lines having the same line type have the coordination relationship.

When the state successively changes into the states S₃ and S₄, the spectrum management device performs operations similar to those in the state S₂, which are not repeated herein.

The second simulation instance is described below with reference to FIGS. 14 to 17. FIGS. 14 and 15 respectively show two simulation scenarios of the second simulation instance, where a dashed line with an arrow indicates a movement trace of a UE. In a simulation scenario 1 shown in FIG. 14, the UE reciprocates along the dashed line, thus the state changes from the state S₁ to S₉ and then to S₁, that is, S₁→S₉→S₁. In a simulation scenario 2 shown in FIG. 15, the UE moves cyclically along a rectangle formed by a dashed line. It is assumed that positions of UEs other than the above UE and positions of the APs remain unchanged in the two scenarios. An initial state is the state S₁ in a case of t=0, and other states can be obtained based on positions to which the UE moves.

Parameters used in the simulation are listed as follows: an operation frequency of 28 GHz, a channel bandwidth of 10 MHz, 6 UEs, emitting power of 0 dBm, 60 APs, an SINR threshold for the UE of 7 dB, a noise figure of 5 dB at a receiver of the UE, 10 generations of population evolution in the genetic algorithm, a crossover ratio of 0.7, a mutation ratio of 0.1, 10 individuals, a beam width of π/4, a Hamming distance threshold of 5 in the simulation scenario 1, and a Hamming distance threshold of 10 in the simulation scenario 2.

Besides the APG selection based on the reinforcement learning algorithm provided in the present disclosure, for comparison, the following simulation for the APG selection based on a comparison algorithm is described with respect to the scenario 1: a new action is acquired by using the genetic algorithm, but the action is determined based on only a switch threshold, that is, a Hamming distance threshold T_(d) for the APG reselection. FIG. 16 is a comparison diagram of a cumulative distribution function (CDF) of a user satisfaction rate obtained based on the simulation scenario 1, where a solid line denotes a CDF curve corresponding to the reinforcement learning algorithm, an upper dashed line denotes a CDF curve corresponding to the comparison algorithm in a case of a Hamming distance threshold being 5, and a lower dashed line denotes a CDF curve corresponding to the comparison algorithm in a case of a Hamming distance threshold being 20. It can be seen that performance based on the reinforcement learning algorithm is superior to that based on the comparison algorithm.

FIG. 17 shows ratios of meeting communication quality requirement such as QoS requirement of a user, in a case that the UE moves along a rectangle trace in the simulation scenario 2 with different numbers of rounds. It can be seen that, with the number of rounds increasing, a satisfaction rate of the user increases correspondingly, that is, the effect of the reinforcement learning algorithm is increasingly significant over time.

Fifth Embodiment

In the process of describing the electronic apparatus for wireless communications in the embodiments described above, some processing and methods are evidently also disclosed. Hereinafter, an overview of the methods is given without repeating some details disclosed above. However, it should be noted that, although the methods are disclosed in a process of describing the electronic apparatus for wireless communications, the methods do not necessarily employ, and are not necessarily executed by, the aforementioned components. For example, while the embodiments of the electronic apparatus for wireless communications may be partially or completely implemented with hardware and/or firmware, the method for wireless communications described below may be executed completely by a computer-executable program, although the hardware and/or firmware of the electronic apparatus for wireless communications can also be used in the methods.

FIG. 18 is a flowchart of a method for wireless communications according to an embodiment of the present disclosure. As shown in FIG. 18, the method for wireless communications includes: determining a coordination APG for a user within a predetermined range, by taking a wireless network topology structure of a wireless network as a state (S12); and re-determining a coordination APG for the user in response to a change of the wireless network topology structure (S7). The wireless network topology structure may include a distribution of users and a distribution of access points. In an example, the coordination APG may be determined by using a reinforcement learning algorithm in step S12.

In step S12, a coordination relationship between the user and an access point is taken as an action in the reinforcement learning algorithm, and with respect to each action, an evaluation of the action is calculated based on a degree of meeting communication quality requirement of the user and a resulting network overhead when performing the action. For example, the coordination APG for the user in a current state is determined based on an action with the highest evaluation. The action with the highest evaluation is the action that, when performed, results in the highest degree of meeting the communication quality requirement of the user and the lowest network overhead, compared with other actions.

In an example, the degree of meeting the communication quality requirement of each user is calculated by using a signal to interference and noise ratio threshold for the user and an estimated signal to interference and noise ratio of the user. The degree of meeting the communication quality requirement of the user may include a utility value of all users and a cost value of not meeting the signal to interference and noise ratio threshold of the user. The utility value of the users is calculated from a utility function. The utility function is a non-linear function of a ratio of the estimated signal to interference and noise ratio of the user to the signal to interference and noise ratio threshold for the user. The cost value depends on a difference between the signal to interference and noise ratio threshold for a user and the estimated signal to interference and noise ratio of the user.

In addition, with respect to each action, a difference between this action and an action determined in a previous state may be used as the network overhead produced by this action. The action may be represented by a binarization matrix of the coordination relationship. The network overhead may be represented by a Hamming distance between actions. The network overhead produced when performing the action may be taken into consideration only when the network overhead exceeds a predetermined overhead threshold.
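The network overhead described above can be sketched as a Hamming distance between two binary actions. The function names are hypothetical, and returning zero when the distance does not exceed the threshold is only one way to express that the overhead is taken into consideration only above the threshold; how the resulting term enters the evaluation is not shown here.

```python
def hamming_distance(action, prev_action):
    """Number of positions at which two equally long binary actions differ."""
    if len(action) != len(prev_action):
        raise ValueError("actions must have the same length")
    return sum(a != b for a, b in zip(action, prev_action))


def overhead_term(action, prev_action, threshold):
    """Overhead counted only when it exceeds the threshold (assumed form)."""
    distance = hamming_distance(action, prev_action)
    return distance if distance > threshold else 0
```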

In addition, as shown in dashed line blocks in FIG. 18, the above method may further include: receiving one or more of position information and communication quality requirement of the user, and one or more of position information, information of maximum emitting power and a predetermined network overhead threshold of the access point (S11), and transmitting information of the determined coordination access point group to the access point (S13). The information received in step S11 is used in calculation in step S12.

The above method may further include a step S14 of storing, with respect to each state, each action in this state in association with an evaluation calculated with respect to the action, as an evaluation matrix. In this way, in a case that the state changes and there is an evaluation matrix for the changed state, an action to be performed in the changed state can be determined based on content of the evaluation matrix.
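The storing and lookup behaviour of step S14 can be sketched with a small in-memory table. The class name `EvaluationStore`, the requirement that states and actions be hashable (for example, tuples), and returning `None` for an unknown state are assumptions of this sketch.

```python
class EvaluationStore:
    """Keeps, for each state, the evaluation calculated for each action."""

    def __init__(self):
        self._table = {}  # state -> {action: evaluation}

    def store(self, state, action, evaluation):
        """Record (or overwrite) the evaluation of an action in a state."""
        self._table.setdefault(state, {})[action] = evaluation

    def has_state(self, state):
        return state in self._table

    def best_action(self, state):
        """Return the stored action with the highest evaluation, or None if unknown."""
        actions = self._table.get(state)
        if not actions:
            return None
        return max(actions, key=actions.get)
```

When `best_action` returns `None`, the evaluations for the new state would have to be calculated before an action can be selected.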

In addition, the above method may further include a step S15 of receiving information of actual communication quality of the user in a case that the state changes. The above method further includes a step S16 of updating the stored evaluation of the action performed in the previous state using information of the actual communication quality of the user when performing the determined action in the previous state, that is, updating the content of the evaluation matrix.

For example, a portion of the evaluation of the action performed in the previous state which is related to the degree of meeting the communication quality requirement of the user may be replaced with the following calculated value: a weighted sum of the actual degree of meeting the communication quality requirement of the user in the previous state and the estimated highest degree of meeting the communication quality requirement of the user in the current state.

In addition, although not shown in FIG. 18, the above method may further include: acquiring, in each state, the action by grouping the access points taking the user as a center and selecting the coordination access point group for the user within a group of the user. For example, the grouping may be performed based on a Euclidean distance between the user and the access point. In this case, the coordination access point group for the user is randomly selected within the group of the user, and the coordination relationship between the user and the access point which meets a predetermined condition is taken as the action. The predetermined condition, for example, may include one or more of: the communication quality for each user meets the communication quality requirement of the user; and the network overhead produced when using this coordination relationship relative to an action determined in the previous state does not exceed a predetermined overhead threshold.

In addition, with respect to each state, a new action may be estimated based on a preliminarily acquired action when acquiring actions. For example, the new action may be estimated by using a genetic algorithm. An action estimated by using the genetic algorithm may be taken as the new action only when the action meets the above predetermined condition.

It should be noted that details of the above method are described in the first to fourth embodiments, and are not repeated herein.

The technology of the present disclosure can be applied to various products. For example, each of the electronic apparatuses 100 to 400 may be implemented as various servers, such as a tower server, a rack-mounted server, and a blade server. Each of the electronic apparatuses 100 to 400 may be a control module (such as an integrated circuitry module including a single die, and a card or blade inserted in a groove of a blade server) mounted on a server.

Application Example Regarding a Server

FIG. 19 is a block diagram showing an example of a schematicconfiguration of a server 700 to which the technology of the presentdisclosure may be applied. The server 700 includes a processor 701, amemory 702, a storage 703, a network interface 704, and a bus 706.

The processor 701 may be, for example, a central processing unit (CPU)or a digital signal processor (DSP), and controls functions of theserver 700. The memory 702 includes random access memory (RAM) and readonly memory (ROM), and stores a program that is executed by theprocessor 701 and data. The storage 703 may include a storage mediumsuch as a semiconductor memory and a hard disk.

The network interface 704 is a wired communication interface forconnecting the server 700 to a wired communication network 705. Thewired communication network 705 may be a core network such as an EvolvedPacket Core (EPC), or a packet data network (PDN) such as the Internet.

The bus 706 connects the processor 701, the memory 702, the storage 703,and the network interface 704 to each other. The bus 706 may include twoor more buses (such as a high speed bus and a low speed bus) each ofwhich has different speed.

In the server 700 shown in FIG. 19, the determining unit 101, the updating unit 102, the grouping unit 201, the estimating unit 301 and the like that are respectively described with reference to FIGS. 2, 5 and 6 may be implemented by the processor 701. The storage unit 103 described with reference to FIG. 4 may be implemented by, for example, the memory 702 or the storage 703. The transceiving unit 401 described with reference to FIG. 7 may be implemented by, for example, the network interface 704. A part of the functions of the storage unit 103 and the transceiving unit 401 may also be implemented by the processor 701. For example, the processor 701 may perform selecting and updating of the coordination APG by performing functions of the determining unit 101, the updating unit 102 and the like.

The basic principle of the present disclosure has been described above in conjunction with particular embodiments. However, as can be appreciated by those ordinarily skilled in the art, all or any of the steps or components of the method and apparatus according to the disclosure can be implemented with hardware, firmware, software or a combination thereof in any computing device (including a processor, a storage medium, etc.) or a network of computing devices, in light of the present disclosure and by making use of general circuit design knowledge or general programming skills.

Moreover, the present disclosure further discloses a program product in which machine-readable instruction codes are stored. The aforementioned methods according to the embodiments can be implemented when the instruction codes are read and executed by a machine.

Accordingly, a memory medium for carrying the program product in which machine-readable instruction codes are stored is also covered in the present disclosure. The memory medium includes but is not limited to a floppy disc, an optical disc, a magneto-optical disc, a memory card, a memory stick and the like.

In the case where the present disclosure is realized with software or firmware, a program constituting the software is installed from a storage medium or network in a computer with a dedicated hardware structure (e.g. the general computer 2000 shown in FIG. 20), wherein the computer is capable of implementing various functions when installed with various programs.

In FIG. 20, a central processing unit (CPU) 2001 executes various processing according to a program stored in a read-only memory (ROM) 2002 or a program loaded to a random access memory (RAM) 2003 from a memory section 2008. The data needed for the various processing of the CPU 2001 may be stored in the RAM 2003 as needed. The CPU 2001, the ROM 2002 and the RAM 2003 are linked with each other via a bus 2004. An input/output interface 2005 is also linked to the bus 2004.

The following components are linked to the input/output interface 2005: an input section 2006 (including a keyboard, a mouse and the like), an output section 2007 (including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), a loudspeaker and the like), a memory section 2008 (including a hard disc and the like), and a communication section 2009 (including a network interface card such as a LAN card, a modem and the like).

The communication section 2009 performs communication processing via a network such as the Internet. A driver 2010 may also be linked to the input/output interface 2005, if needed. If needed, a removable medium 2011, for example, a magnetic disc, an optical disc, a magneto-optical disc, a semiconductor memory and the like, may be installed in the driver 2010, so that the computer program read therefrom is installed in the memory section 2008 as appropriate.

In the case where the foregoing series of processing is achieved through software, programs forming the software are installed from a network such as the Internet or a memory medium such as the removable medium 2011.

It should be appreciated by those skilled in the art that the memory medium is not limited to the removable medium 2011 shown in FIG. 20, which has programs stored therein and is distributed separately from the apparatus so as to provide the programs to users. The removable medium 2011 may be, for example, a magnetic disc (including floppy disc (registered trademark)), a compact disc (including compact disc read-only memory (CD-ROM) and digital versatile disc (DVD)), a magneto-optical disc (including mini disc (MD) (registered trademark)), or a semiconductor memory. Alternatively, the memory medium may be the ROM 2002, the hard discs included in the memory section 2008, and the like, in which programs are stored and which are distributed to users along with the device in which they are incorporated.

It is to be further noted that, in the apparatus, method and system according to the present disclosure, the respective components or steps can be decomposed and/or recombined. These decompositions and/or recombinations shall be regarded as equivalent solutions of the disclosure. Moreover, the above series of processing steps can naturally be performed in the temporal sequence described above but are not limited thereto, and some of the steps can be performed in parallel or independently of each other.

Finally, it is to be further noted that the term “include”, “comprise” or any variant thereof is intended to encompass nonexclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements but also other elements which are not expressly listed, or an element(s) inherent to the process, method, article or device. Moreover, unless further defined, the expression “comprising a(n) . . . ” in which an element is defined does not preclude the presence of an additional identical element(s) in a process, method, article or device comprising the defined element(s).

Although the embodiments of the present disclosure have been described above in detail in connection with the drawings, it shall be appreciated that the embodiments as described above are merely illustrative rather than limitative of the present disclosure. Those skilled in the art can make various modifications and variations to the above embodiments without departing from the spirit and scope of the present disclosure. Therefore, the scope of the present disclosure is defined merely by the appended claims and their equivalents.

1. An electronic apparatus for wireless communications, comprising: processing circuitry, configured to: determine a coordination access point group for a user within a predetermined range, by taking a wireless network topology structure of a wireless network as a state; and re-determine a coordination access point group for the user in response to a change of the wireless network topology structure, wherein the wireless network topology structure comprises a distribution of users and a distribution of access points.
2. The electronic apparatus according to claim 1, wherein the processing circuitry is configured to take a coordination relationship between the user and an access point as an action, and with respect to each action, calculate, based on a degree of meeting communication quality requirement of the user and a resulting network overhead when performing the action, an evaluation of the action, wherein the processing circuitry is configured to determine the coordination access point group for the user in current state based on an action with the highest evaluation.
3. The electronic apparatus according to claim 2, wherein the action with the highest evaluation is an action which, when being performed, results in the highest degree of meeting the communication quality requirement of the user and the lowest network overhead, compared with other actions.
4. The electronic apparatus according to claim 2, wherein the processing circuitry is configured to determine, using a signal to interference and noise ratio threshold for each user and an estimated signal to interference and noise ratio of the user, the degree of meeting the communication quality requirement of the user.
5. The electronic apparatus according to claim 4, wherein the degree of meeting the communication quality requirement of the user comprises a utility value of all users and a cost value of not meeting the signal to interference and noise ratio of the users, wherein the utility value of the user is calculated from a utility function, the utility function is a non-linear function of a ratio of the estimated signal to interference and noise ratio to the signal to interference and noise ratio threshold, and the cost value depends on a difference between the signal to interference and noise ratio threshold for a user and the estimated signal to interference and noise ratio of the user.
6. The electronic apparatus according to claim 2, wherein the processing circuitry is configured to use, with respect to each action, a difference between this action and an action determined in a previous state as the network overhead produced by this action, the network overhead being represented by a Hamming distance between actions.
7. The electronic apparatus according to claim 2, wherein the network overhead produced when performing the action is taken into consideration when the network overhead exceeds a predetermined overhead threshold.
8. The electronic apparatus according to claim 2, further comprising a storage, configured to store, with respect to each state, each action in this state in association with an evaluation calculated with respect to the action as an evaluation matrix.
9. The electronic apparatus according to claim 8, wherein the processing circuitry is further configured to determine, when the state changes and in a case that there is an evaluation matrix for the changed state, an action to be performed in the changed state based on content of the evaluation matrix.
10. The electronic apparatus according to claim 8, wherein the processing circuitry is configured to update, when the state changes, an evaluation for the action performed in the previous state which is stored in the storage using information of actual communication quality of the user when performing the determined action in the previous state.
11. The electronic apparatus according to claim 10, wherein the processing circuitry is configured to replace a portion of the evaluation of the action performed in the previous state which is related to the degree of meeting the communication quality requirement of the user with a following calculated value: a weighted sum of the actual degree of meeting the communication quality requirement of the user in the previous state and the estimated highest degree of meeting the communication quality requirement of the user in the current state.
12. The electronic apparatus according to claim 2, wherein the processing circuitry is configured to: in a state, acquire the action by grouping the access points taking the user as a center and selecting the coordination access point group for the user within a group of the user.
13. The electronic apparatus according to claim 12, wherein the processing circuitry is configured to perform the grouping according to a Euclidean distance between the user and the access point.
14. The electronic apparatus according to claim 12, wherein the processing circuitry is configured to randomly select the coordination access point group for the user within the group of the user and take the coordination relationship between the user and the access point which meets a predetermined condition as the action.
15. The electronic apparatus according to claim 14, wherein the predetermined condition comprises one or more of: the communication quality for each user meets the communication quality requirement of the user; the network overhead produced when using this coordination relationship relative to an action determined in the previous state does not exceed a predetermined overhead threshold.
16. The electronic apparatus according to claim 2, wherein the processing circuitry is further configured to estimate, with respect to each state, a new action based on a preliminarily acquired action.
17. The electronic apparatus according to claim 16, wherein the processing circuitry is configured to estimate the new action using a genetic algorithm, and take an action estimated by the genetic algorithm as the new action only when the action satisfies a predetermined condition.
18. The electronic apparatus according to claim 1, further comprising: a transceiving unit, configured to receive one or more of position information and communication quality requirement of the user and one or more of position information, information of maximum emitting power and a predetermined network overhead threshold of the access point, and transmit information of the determined coordination access point group to the access point.
19. The electronic apparatus according to claim 18, wherein the transceiving unit is further configured to receive information of the actual communication quality of the user.
20. A method for wireless communications, comprising: determining a coordination access point group for a user within a predetermined range, by taking a wireless network topology structure of a wireless network as a state; and re-determining a coordination access point group for the user in response to a change of the wireless network topology structure, wherein the wireless network topology structure comprises a distribution of the users and a distribution of the access points.
21. (canceled)